On solving Ordinary Differential Equations using Gaussian Processes
Abstract
We describe a set of Gaussian Process based approaches that can be used to solve non-linear Ordinary Differential Equations. We suggest an explicit probabilistic solver and two implicit methods, one analogous to Picard iteration and the other to gradient matching. All methods have greater accuracy than previously suggested Gaussian Process approaches. We also suggest a general approach that can yield error estimates from any standard ODE solver.

1 The Initial Value Problem

Given an Ordinary Differential Equation (ODE) with known initial condition x(t_1) = x_1,

    d/dt x(t) = f(t, x(t), θ)    (1)

the Initial Value Problem (IVP) is to find the differentiable function x(t) over some specified time interval t ∈ [t_1, t_T] that satisfies the ODE subject to the initial value condition. (The more general boundary value problems can also be addressed using related approaches.) In general x(t) is a vector, so that higher-order scalar ODEs can be embedded as first-order vector ODEs [8].
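As a brief worked illustration of this embedding (our example, not taken from the original text), the second-order scalar ODE z̈(t) = −θ z(t) becomes a first-order vector ODE for the stacked state x(t) = (z(t), ż(t)):

    d/dt x(t) = d/dt (z(t), ż(t)) = (ż(t), −θ z(t)) ≡ f(t, x(t), θ)

which is exactly of the form (1), so any solver for (1) applies to higher-order problems as well.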
In general this problem requires an approximate numerical solution, and we denote the approximation to x(t_n) at time t_n by x_n. There is a vast literature on this topic (see [8] for an introduction) and several families of techniques that can be applied, such as one-step methods, multistep methods, fixed and variable step length, and implicit and explicit approaches. In general there is no single 'best' method, the methods having different properties in terms of numerical stability, speed, number of evaluations of the function f, parallelisability, etc.

Recently there has been interest in the machine learning community in the application of Gaussian Processes to the IVP [9, 5, 2] and to the estimation of ODE parameters given potentially noisy observations D [1, 3, 10]. An ideal approach to parameter estimation is based on Bayesian numerical integration. Writing t_1, . . . , t_N for the times at which data is observed and x(t_n) for the true solution to the IVP at those times,

    p(θ, x_{2:N} | x_1, D) ∝ p(D | x_{2:N}) p(x_{2:N} | x_1, θ)    (2)

where the term p(x_{2:N} | x_1, θ) represents a distribution over true solutions given the initial value. Generally this otherwise ideal approach is problematic, since classical IVP solution techniques do not produce a distribution over solutions x_{2:N}, meaning that the uncertainty in the solution (which must exist due to the numerical approximation) is not correctly accounted for. The approaches in [1, 3, 10] use Gaussian Processes to circumvent the requirement to produce p(x_{2:N} | x_1, θ); they work by implicitly fitting an alternative function to the data whose gradient must match the gradient specified by the ODE at the observation times.

Whilst these recent parameter estimation approaches that avoid the requirement to find p(x_{2:N} | x_1, θ) look promising, it nevertheless remains of interest to find distributions p(x_{2:N} | x_1, θ), since these can be used to solve the IVP and characterise uncertainty in the solution. This is the focus of this work, in which we assume that the parameters θ of the ODE are known but an estimate of uncertainty in the solution is required. We therefore assume θ is fixed and known and drop the notational conditioning on θ throughout.

1.1 Gaussian Processes and Linear ODEs

In the case that f is linear in x(t), the solution involves integrals of matrix exponentials, which generally cannot be computed in closed form but can be approximated using, for example, the Magnus expansion [8].

An alternative approximate approach that avoids explicit integration and generalises the solution to the case of additive Gaussian noise is to assume that x(t) follows a Gaussian Process (GP) with covariance function C(t, t′); see for example [4]. Then, writing f(t, x(t)) in terms of a matrix L, a time-varying term φ(t) and Gaussian noise ε(t),

    f(t, x(t)) = Lx(t) + φ(t) + ε(t)

we have

    y(t) ≡ ẋ(t) − Lx(t) − ε(t) = φ(t),  where ẋ(t) ≡ d/dt x(t)    (3)

Since x(t) is assumed a GP, and y(t) is a linear function of x(t) and ε(t), y(t) is also a GP. The covariance function of this new process is straightforward to obtain using the standard rules; see [7]. For example, the covariance terms involving ẋ are obtained simply by differentiating the covariance function of x:

    ⟨ẋ(t) x(t′)⟩_{p(ẋ,x)} = ∂/∂t C(t, t′),    ⟨ẋ(t) ẋ(t′)⟩_{p(ẋ)} = ∂²/∂t∂t′ C(t, t′)    (4)

where ⟨f(x)⟩_{p(x)} denotes the expectation of the function f(x) with respect to the distribution p(x). Given the observations y_n ≡ φ(t_n) at the observation times, together with any boundary or initial conditions on x and ẋ, x(t) is a GP whose mean and covariance function are given by the standard Gaussian conditioning formulae

    p(x | y) = N( x | μ_x + C_{xy} C_{yy}^{-1} (y − μ_y),  C_{xx} − C_{xy} C_{yy}^{-1} C_{yx} )    (5)

In this way we can globally and approximately solve the IVP (or BVP), giving a Gaussian distribution over the solution approximation p(x_{2:N} | x_1, D). Whilst this method has cubic complexity in the number of points N at which we need to evaluate the function, this will typically be much smaller than the number of timepoints used in a standard ODE solver, provided that the solution is sufficiently smooth.

1.2 Skilling's IVP approach for non-linear ODEs

In the case that f is not linear in x, the problem is generally much more complex. One approach would be to assume that the approximation follows a GP and perform local linearisation, analogous to exponential integrators; see for example [6]. However, recent work in this area [5, 2] has developed the suggestion by Skilling [9], which we outline below.

The fundamental quantity of interest in Skilling's [9] approach for the IVP is the set of derivatives ẋ ≡ ẋ_1, . . . , ẋ_N at specified 'knotpoints' t_1, . . . , t_N. In [5, 2] a GP is assumed for the approximate solution x_{1:N}. We start with the known initial state x_1 and compute its derivative ẋ_1 = f(x_1). This is the only point and derivative that we know with certainty; one can interpret this as an observation of the derivative with zero observation variance, σ²_1 = 0. We assume a zero-mean GP with known covariance function. Using the GP we can form a distribution for the solution at the next knotpoint,

    p_GP(x_2 | x_1, ẋ_1)    (6)

We now sample a value for x_2 from (6) and subsequently compute the derivative ẋ_2 = f(x_2). Note that both the value x_2 and the derivative ẋ_2 will not necessarily correspond to the true solution and its derivative. Because of this, only the derivative is retained, and the interpretation is that one has observed the derivative ẋ_2 with measurement noise of variance σ²_2, specified by the variance of p_GP(ẋ_2 | x_1, ẋ_1). Given x_1, ẋ_1 = f(x_1), ẋ_2 = f(x_2) and the corresponding variances σ²_{1:2} on these observations, one now forms the GP prediction

    p_GP(x_3 | x_1, ẋ_1 = f(x_1), ẋ_2 = f(x_2), σ²_{1:2})    (7)

As before, one then samples a value for x_3 and subsequently computes the derivative observation ẋ_3 = f(x_3), which is assumed to be measured with the observation variance obtained from p_GP(ẋ_3 | ẋ_{1:2}, σ²_{1:2}). One continues in this manner, sampling a value x_n at each new knotpoint and treating f(x_n) as a noisy observation of the derivative there, until all knotpoints have been visited.
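To make the sequential procedure above concrete, the following is a minimal Python sketch of a Skilling-style solver for a scalar ODE, assuming a zero-mean GP as in the text. The squared-exponential kernel, its length-scale, the jitter values, the knotpoint grid and the logistic ODE used at the end are illustrative choices of ours and are not prescribed by [9, 5, 2]; the kernel derivatives implement the covariance relations in (4).

import numpy as np


def k(s, t, ell=0.5):
    # Squared-exponential covariance C(s, t) (an assumed kernel choice).
    return np.exp(-(s - t) ** 2 / (2.0 * ell ** 2))


def dk_ds(s, t, ell=0.5):
    # dC/ds: covariance between the derivative x'(s) and the value x(t), cf. (4).
    return -(s - t) / ell ** 2 * k(s, t, ell)


def d2k_dsdt(s, t, ell=0.5):
    # d^2 C / ds dt: covariance between the derivatives x'(s) and x'(t), cf. (4).
    return (1.0 / ell ** 2 - (s - t) ** 2 / ell ** 4) * k(s, t, ell)


def cov(obs_a, obs_b, ell=0.5):
    # Covariance matrix between two lists of observations, where each
    # observation is a pair (time, is_derivative).
    C = np.zeros((len(obs_a), len(obs_b)))
    for i, (s, da) in enumerate(obs_a):
        for j, (t, db) in enumerate(obs_b):
            if da and db:
                C[i, j] = d2k_dsdt(s, t, ell)
            elif da:
                C[i, j] = dk_ds(s, t, ell)
            elif db:
                C[i, j] = dk_ds(t, s, ell)  # by symmetry of the kernel
            else:
                C[i, j] = k(s, t, ell)
    return C


def skilling_solve(f, x1, ts, ell=0.5, seed=0):
    # Sequentially sample an approximate solution at the knotpoints ts,
    # treating f(x_n) as a noisy derivative observation whose noise variance
    # is the GP predictive variance of the derivative at t_n.
    rng = np.random.default_rng(seed)
    obs = [(ts[0], False), (ts[0], True)]      # x(t_1) and x'(t_1) are known
    y = np.array([x1, f(ts[0], x1)])           # observed values
    noise = np.array([1e-10, 1e-10])           # sigma^2_1 = 0 (small jitter)
    xs = [x1]
    for t_next in ts[1:]:
        Kyy = cov(obs, obs, ell) + np.diag(noise)
        # Predictive distribution of x(t_next) given everything observed so far.
        kx = cov([(t_next, False)], obs, ell)
        mean_x = (kx @ np.linalg.solve(Kyy, y)).item()
        var_x = (k(t_next, t_next, ell) - kx @ np.linalg.solve(Kyy, kx.T)).item()
        x_next = rng.normal(mean_x, np.sqrt(max(var_x, 0.0)))
        # Predictive variance of x'(t_next): the observation noise attached
        # to the derivative observation f(x_next), as in the text.
        kd = cov([(t_next, True)], obs, ell)
        var_d = (d2k_dsdt(t_next, t_next, ell) - kd @ np.linalg.solve(Kyy, kd.T)).item()
        obs.append((t_next, True))
        y = np.append(y, f(t_next, x_next))
        noise = np.append(noise, max(var_d, 1e-10))
        xs.append(x_next)
    return np.array(xs)


# Illustrative use on the logistic ODE dx/dt = x(1 - x), x(0) = 0.1.
knots = np.linspace(0.0, 2.0, 21)
samples = skilling_solve(lambda t, x: x * (1.0 - x), 0.1, knots)

Each call produces one sampled solution path at the knotpoints; repeating the call with different seeds gives an ensemble whose spread indicates the uncertainty in the numerical solution.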
Journal: CoRR, abs/1408.3807, 2014.